<!DOCTYPE html>
<html lang="en">
<head>
  <title>jsoup Javadoc overview</title>
</head>
<body>
<h1>jsoup: Java HTML parser that makes sense of real-world HTML soup.</h1>

<p><b>jsoup</b> is a Java library for working with real-world HTML. It provides a convenient API for fetching URLs
  and for extracting and manipulating data, using HTML5 DOM methods and CSS selectors.</p>

<p>jsoup implements the <a href="https://html.spec.whatwg.org/multipage/">WHATWG HTML</a> specification, and parses HTML into the same DOM
  that modern browsers produce.</p>

<ul>
  <li>parse HTML from a URL, file, or string
  <li>find and extract data, using DOM traversal or CSS selectors
  <li>manipulate the HTML elements, attributes, and text
  <li>clean user-submitted content against a safelist, to prevent XSS
  <li>output tidy HTML
</ul>

<p>jsoup is designed to deal with all varieties of HTML found in the wild: from pristine and validating
  to invalid tag soup, jsoup will create a sensible parse tree.</p>

<p>See <a href="https://jsoup.org/"><b>jsoup.org</b></a> for downloads, documentation, and examples.</p>

@author <a href="https://jonathanhedley.com/">Jonathan Hedley</a>

</body>
</html>
